Identifying Phrasal Verbs Using Many Bilingual Corpora

Authors

  • Karl Pichotta
  • John DeNero
Abstract

We address the problem of identifying multiword expressions in a language, focusing on English phrasal verbs. Our polyglot ranking approach integrates frequency statistics from translated corpora in 50 different languages. Our experimental evaluation demonstrates that combining statistical evidence from many parallel corpora using a novel ranking-oriented boosting algorithm produces a comprehensive set of English phrasal verbs, achieving performance comparable to a human-curated set.

Similar resources

Grammarless Extraction of Phrasal Translation Examples from Parallel Texts

We describe a method for identifying subsentential phrasal translation examples in sentence-aligned parallel corpora, using only a probabilistic translation lexicon for the language pair. Our method differs from previous approaches in that (1) it is founded on a formal basis, making use of an inversion transduction grammar (ITG) formalism that we recently developed for bilingual language modelin...

Identifying Text Genres Using Phrasal Verbs

Understanding the textual distinction between spokenness-informality and writtenness-formality serves many purposes. It can facilitate text mining, improve parser accuracy, offer better appraisals of student writing, and may also facilitate better interpretations of experimental data. Previous studies of such textual variation (e.g., Biber, 1988; Louwerse et al., 2004) have failed to produce a ...

The Effect of Conceptual Metaphor Awareness on Learning Phrasal Verbs by Iranian Intermediate EFL Learners

The ability to comprehend and produce phrasal verbs, as lexical chunks or groups of words which are commonly found together, is an important part of language learning. This study investigates the effect of ‘conceptual metaphor awareness’, as a newly developed technique in Cognitive Linguistics, on learning phrasal verbs by Iranian intermediate EFL learners. To meet this objective, two intact ho...

The Comparative Effect of Visual vs. Auditory Input Enhancement on Learning Non-Congruent Phrasal Verbs by Iranian EFL Learners

Vocabulary is one of the essential components of language, and learning phrasal verbs as part of vocabulary is quite challenging for foreign language learners. The present study aimed at investigating the effects of visual and auditory input enhancement on learning non-congruent phrasal verbs. The participants of the study were 90 intermediate English language learners who were divided into two ...

Bilingual Terminology Acquisition from Comparable Corpora and Phrasal Translation to Cross-Language Information Retrieval

The present paper presents an approach to bilingual lexicon extraction from non-aligned comparable corpora and to phrasal translation, together with evaluations on Cross-Language Information Retrieval. A two-stage translation model is proposed for the acquisition of bilingual terminology from comparable corpora, disambiguation and selection of best translation alternatives according to their...

Publication date: 2013